Cross-Project Defect Prediction with Metrics Selection and Balancing Approach

Abstract

In software development, defects adversely affect both quality and cost. Software defect prediction (SDP) is one of the techniques that improve testing efficiency through early identification of defects (bugs/faults/errors). Accordingly, several experiments have been suggested for defect prediction (DP) techniques. DP methods mainly use historical project data to construct prediction models. SDP performs well within a project as long as an adequate amount of data is accessible to train the models. However, if the data are inadequate or limited within the same project, researchers mainly use Cross-Project Defect Prediction (CPDP). CPDP is a possible alternative that refers to anticipating defects using prediction models built on historical data from other projects. CPDP is challenging because of its data distribution and domain difference problem. The proposed framework is an effective two-stage approach for CPDP, i.e., a model generation phase and a prediction process phase. In the model generation phase, a conglomeration of different pre-processing techniques, including feature selection and a class reweighting technique, is used to improve the initial data quality. Finally, a fine-tuned, efficient hybrid ensemble model based on bagging and boosting is developed, which avoids over-fitting/under-fitting and helps enhance prediction performance. In the prediction process phase, the generated model predicts whether data from other projects is defective or clean. The framework is evaluated using 25 projects obtained from public repositories. The result analysis shows that the proposed model achieved an F1-score of 0.71±0.03, significantly improving on state-of-the-art approaches by 23 % to 60 %.
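As a rough illustration of two ideas the abstract mentions, class reweighting and the F1-score, here is a minimal Python sketch. The abstract does not specify the reweighting scheme used in the paper, so the common "balanced" heuristic (weight = n_samples / (n_classes × class_count)) is assumed here. The function names and the label data are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count).

    Rare classes (e.g. defective modules) receive larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical source-project labels: 1 = defective, 0 = clean.
source_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(source_labels)

# Hypothetical target-project ground truth and predictions.
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

In a cross-project setting, such weights would be computed on the source projects' training data and passed to the learner, while the F1-score would be measured on the separate target project.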

Similar resources

Benchmarking cross-project defect prediction approaches with costs metrics

Defect prediction can be a powerful tool to guide the use of quality assurance resources. In recent years, many researchers have focused on the problem of Cross-Project Defect Prediction (CPDP), i.e., the creation of prediction models based on training data from other projects. However, only a few of the published papers evaluate the cost efficiency of predictions, i.e., if they save costs if they are...

Cross-project defect prediction

Prediction of software defects works well within projects as long as there is a sufficient amount of data available to train any models. However, this is rarely the case for new software projects and for many companies. So far, only a few studies have focused on transferring prediction models from one project to another. In this paper, we study cross-project defect prediction models on a large ...

Towards Cross-Project Defect Prediction with Imbalanced Feature Sets

Cross-project defect prediction (CPDP) has been deemed as an emerging technology of software quality assurance, especially in new or inactive projects, and a few improved methods have been proposed to support better defect prediction. However, the regular CPDP always assumes that the features of training and test data are all identical. Hence, very little is known about whether the method for C...

TDSelector: A Training Data Selection Method for Cross-Project Defect Prediction

Context: In recent years, cross-project defect prediction (CPDP) attracted much attention and has been validated as a feasible way to address the problem of local data sparsity in newly created or inactive software projects. Unfortunately, the performance of CPDP is usually poor, and low quality training data selection has been regarded as a major obstacle to achieving better prediction results...

Comparison of local classifiers for cross-project defect prediction

There is a connection between static source code metrics (for example, lines of code or cyclomatic complexity) and potential defects in the source code. Obviously, there is no closed formula, but with the field of machine learning and its techniques we have a tool at our disposal that has the ability to infer rules from large amounts of data. In this thesis, we use machine learning techniques to...

Journal

Journal title: Applied Computer Systems

Year: 2022

ISSN: 2255-8691, 2255-8683

DOI: https://doi.org/10.2478/acss-2022-0015